1. Discovery

The data available for this project is a one-year data set from a Japanese restaurant in California, including order number, item name, order date, order price, order quantity, party size, and other fields. From this data we identified five major, achievable analysis targets concerning the restaurant’s menu, each delving into patrons’ tendencies in order to maximize the restaurant’s revenue.

First, we’d like to determine which menu items are the most popular. Establishing which items are ordered most frequently tells us what people like and what the restaurant can promote in order to capitalize on the popularity of specific types of food. Identifying the varying levels of popularity across the menu will also reveal where the restaurant may be wasting time and money, if there is a clear distinction between popular and unpopular items. The time, money, and labor required to serve items that are rarely purchased could be redirected toward items we know will sell.

Next, we’d like to look at the distribution of party sizes and the portion sizes of each item ordered. If larger parties consistently order larger portions, and smaller parties smaller ones, that is natural and does not affect an item’s popularity; if, for example, everyone orders a small portion, there may be something worth investigating. Distinguishing party size from the quantity of an item ordered helps identify which items are truly popular. A large group ordering a large quantity does not necessarily mean the item is especially popular if the quantity is just enough to feed the party; compared with a normal-sized party or an individual ordering the same item, demand may be identical but inflated by having more people at one table. Because we want to know which items are popular regardless of party size, we need to account for how party size can skew popularity when it is assessed simply by quantity ordered.
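One minimal sketch of this adjustment in base R (the toy orders below are invented, not drawn from our dataset, and the per-guest rate is just one possible normalization):

```r
# Toy order lines: item, quantity ordered, and party size (guestNum)
orders <- data.frame(
  item     = c("California Roll", "California Roll", "Edamame", "Edamame"),
  quantity = c(6, 1, 2, 2),
  guestNum = c(6, 1, 4, 2)
)

# Raw popularity: total quantity, which large parties inflate
raw <- tapply(orders$quantity, orders$item, sum)

# Party-size-adjusted popularity: quantity per guest, averaged over orders
orders$perGuest <- orders$quantity / orders$guestNum
adjusted <- tapply(orders$perGuest, orders$item, mean)

raw       # California Roll = 7, Edamame = 4
adjusted  # California Roll = 1.00, Edamame = 0.75
```

By raw quantity the roll looks almost twice as popular, but per guest the two items are much closer: most of the roll’s volume came from one large party.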

The third component is customers’ reactions to price changes. How do discounts or specials affect people’s willingness to pay for and buy certain items? Can we pair less popular items with more popular ones to increase sales across the menu? Do price changes attract enough attention that a special or discount will draw people toward foods that benefit the restaurant?

The fourth component asks whether sets of foods are more popular than individual items: are people ordering an item as part of a larger meal, or targeting that specific item on its own? Would it benefit the restaurant more to bundle certain menu items at a higher price, or to keep certain items on their own because we know people will buy enough of them individually? We want to identify which foods pair best together and which combinations will seem most appealing to customers based on their previous purchasing records, looking at parties of all sizes and noting what was purchased for each table, to see whether we can build set menus where there is a clear pattern of foods pairing well.

The fifth component is looking at the prices and quantities of all menu items ordered to see which items bring in the most money. Popularity based solely on quantity ordered fails to account for price, so the items with the most purchases and the items with the most revenue can differ. Cheaper items may be more appealing and sell in higher volume, while pricier items, despite fewer orders, can still bring in more money for the restaurant. Highlighting the menu items that are both popular and high-revenue is a key part of our end goal.
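A toy illustration of why the two rankings can disagree (the items, counts, and prices below are invented, not from our data):

```r
# Hypothetical menu summary: units sold and unit price for three items
menu <- data.frame(
  item  = c("Miso Soup", "Green Tea", "O-Toro Sashimi"),
  sold  = c(900, 700, 150),
  price = c(3, 2, 28)
)
menu$revenue <- menu$sold * menu$price

# Most popular by quantity vs. biggest earner by revenue
menu$item[which.max(menu$sold)]     # "Miso Soup"
menu$item[which.max(menu$revenue)]  # "O-Toro Sashimi"
```

The cheap soup wins on volume, but the expensive sashimi wins on revenue, which is why we rank items both ways in the revenue analysis below.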

Among our five goals, analyzing price elasticity is likely the most difficult to achieve technically, while calculating revenue and measuring popularity are on the easier side. Our ultimate target is to use the results of the popularity and revenue analyses to develop a recommendation system based on association rules, further increasing the restaurant’s revenue.
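For the price-elasticity goal, one standard approach we could apply is the midpoint (arc) elasticity formula, which relates the percentage change in quantity demanded to the percentage change in price. The sketch below uses made-up prices and weekly quantities, not observations from our dataset:

```r
# Own-price elasticity from two observed (price, quantity) points,
# using the midpoint (arc) formula so the result does not depend on
# which point is treated as "before"
arc_elasticity <- function(p1, q1, p2, q2) {
  dq <- (q2 - q1) / ((q1 + q2) / 2)  # percentage change in quantity
  dp <- (p2 - p1) / ((p1 + p2) / 2)  # percentage change in price
  dq / dp
}

# e.g. a roll discounted from $12 to $10 while weekly orders rise 80 -> 100
arc_elasticity(12, 80, 10, 100)  # about -1.22: demand is price-elastic
```

A magnitude above 1 would suggest the discount raises quantity more than proportionally, so it may pay off; in practice we would need repeated price changes for the same item to estimate this from our data.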

2. Data Preparation

Work in Open Refine:
In OpenRefine, we corrected typos and misspellings and clustered item names (saleItemName) that refer to the same item but are spelled differently (for example, “California Roll” vs. “California roll”). We merged the lunch and dinner (L/D) variants of items, since the lunch/dinner distinction does not matter for our analysis. We deleted the category for deleted items and rows with a quantity of 0, since cancellations are not of interest. We also removed outliers: for three-digit quantity values we suspected typos (for example, 10 entered instead of 1), but these are very difficult to confirm, so we inspected the high quantity values (such as 82) individually to judge whether they could be genuine. Finally, we removed system-generated items such as gift cards, and parties where the guest number was 0 or 100.

In this part of the code, we read the file and loaded the necessary packages: readxl to read the file, lubridate for time analysis, dplyr and forcats for data manipulation, tidyr for reshaping, ggplot2 for data visualization, arules for association rules, and arulesViz for association rule visualization.

Since the order number, order quantity, and date variables were all read in as characters, we converted the numeric values to numeric variables and the dates to the date data type.

Since the restaurant is located in California, we set the time zone to Pacific Time (America/Los_Angeles).

We tried to remove all “free” items by filtering out prices of 0; however, even after requiring price > 0.05, some free items remained because they had been recorded with prices such as 0.1. We therefore deleted the following items individually: Free Miso Soup, Free Rice, Rice Soup (Free), Kurodai Bone Soup Free, and F. Lobster Noodle (Free). We had assumed the price > 0.05 filter removed them (their price was supposed to be 0) and only discovered they were still present while performing the “items with lowest revenue” analysis. We also removed the Pre-Paid Credit item.

We counted how many orders fall under each order type: dine in, to go, online pickup, pickup, and delivery.

## # A tibble: 5 × 2
## # Groups:   orderType [5]
##   orderType          n
##   <chr>          <int>
## 1 DINE_IN       223994
## 2 TOGO           13621
## 3 ONLINE_PICKUP     72
## 4 PICKUP            34
## 5 DELIVERY          10
## Rate of DINE-IN orders: 94 %
## Rate of NON-DINE-IN orders: 6 %

We count how many items fall under each menu group and under each item category.

## # A tibble: 3 × 2
## # Groups:   menuGroupName [3]
##   menuGroupName      n
##   <chr>          <int>
## 1 All Day       192577
## 2 Lunch          26231
## 3 Dinner         18923
## # A tibble: 34 × 2
## # Groups:   categoryName [34]
##    categoryName             n
##    <chr>                <int>
##  1 Roll                 20840
##  2 Yakitori             19201
##  3 Combination/L        19109
##  4 Rice/Noodle          14874
##  5 Appetizers           14811
##  6 Icho Signature Plate 14409
##  7 Sashimi              13296
##  8 Single Dish          13089
##  9 Wine                 11432
## 10 Combination/D         9584
## # … with 24 more rows

2.1. Item Revenue

Here we analyzed which items bring the restaurant the highest and lowest overall revenue.

2.2. Daily Revenue

2.3. Monthly Revenue

3. Model Planning

For our research we will build an association-rule model to help distinguish popular menu items, suggest how the restaurant can reorganize its menu offerings, and increase overall revenue. The model examines how the structure of the menu, and the grouping of certain items together or apart, affects people’s interest in buying different items. By identifying relationships between items based on their relevance to each other and on customers’ history of buying certain foods together, we can make those items more appealing, which in turn should increase the quantity purchased across multiple items, not just the popular ones. A rule has essentially two parts: an antecedent and a consequent, which together form an itemset. The antecedent is a menu item that triggers interest in another item; the consequent is the complementary item it points to. This lets us identify suggestive pairings where customers who buy one item are likely to purchase another along with it, similar to buying popcorn and a drink at a movie theater: the more popcorn you eat, the more you want to drink because of the salt. It is an association in which people are drawn to buy the two items together rather than one by itself.

The metrics that organize this model are support, confidence, and lift. Support measures how frequently an itemset occurs: for a rule A => B, it is the number of transactions containing both item A and item B divided by the total number of transactions. This tells us whether a pairing occurs often enough, relative to all purchases, to be worth considering. Confidence is the likelihood of purchasing one item given that the other is already in the order: the number of transactions containing both A and B divided by the number of transactions containing A (or containing B, for the reverse rule). Lift acts as a control for whether an itemset reflects a genuine association or whether customers buy items A and B together merely by chance: it is the support of A and B divided by the product of the support of A and the support of B (equivalently, the confidence of A => B divided by the support of B), so a lift well above 1 indicates the pairing is relational rather than coincidental. Using this model, we can plan itemsets that increase a customer’s inclination to buy one item after buying another, help the restaurant decide how to structure its menu, and determine what to add or remove. In our data, the association model lets us recommend complementary items to customers by analyzing previous associations between different items.
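As a quick sanity check of these definitions, the three metrics can be computed by hand on a handful of toy transactions (the orders below are made up, not drawn from our dataset):

```r
# Five toy transactions (orders), each a set of items
txns <- list(
  c("Salmon", "Miso Soup"),
  c("Salmon", "Miso Soup", "Green Tea"),
  c("Salmon"),
  c("Miso Soup"),
  c("Green Tea")
)
n <- length(txns)
has <- function(item) sapply(txns, function(t) item %in% t)

# Support(A => B) = P(A and B); Confidence = P(B | A); Lift = Confidence / P(B)
suppAB <- sum(has("Salmon") & has("Miso Soup")) / n   # 2/5 = 0.4
conf   <- suppAB / (sum(has("Salmon")) / n)           # 0.4 / 0.6 = 0.67
lift   <- conf / (sum(has("Miso Soup")) / n)          # 0.67 / 0.6 = 1.11
```

A lift of about 1.11 here means the pair appears only slightly more often than it would if the two items were bought independently; the arules package computes exactly these quantities at scale.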

4. Model Building

For our recommendation system we used the Apriori algorithm to find the itemsets that occur most frequently in the dataset, so that we could identify which menu items customers are most likely to buy together. We also used Eclat, specifically its weighted variant weclat. Eclat gives results similar to Apriori, but weclat allows us to weight transactions by revenue. Weclat works better than Apriori in this case because, even with a higher support threshold, we obtain more rules than before; however, it is not easy to build a recommendation system on top of this algorithm.

5. Communicating Results

We use association rules in this part, converting the dataset into a transactions object. Since the only columns we need are orderId and saleItemName, we create a data frame with only these two columns and rename them TransNum and Item. We convert them to factors so they are ordered and can be used to build the transaction object later. We count the unique elements to check that they match the dimensions of the transaction object. We set the column names to NULL because otherwise we would get a spurious transaction made of the column names of the myorder data frame. We then write the data to an in-memory file and read it back with read.transactions to create a transactions object in “single” format with no duplicates.

Starting with the Apriori algorithm, we plotted the 10 items with the highest support. We then ran apriori with support at least 0.01 and confidence at least 0.3, setting minlen = 2 to avoid trivial “single” rules such as salmon => salmon. We extracted and plotted the rules: at this support and confidence level there are 32 rules, each containing either 2 or 3 items. We also plotted the rules by lift, showing which have higher or lower lift.

6. Operationalize

6.1. Apriori Algorithm

Relative item frequency (support).

Using the apriori function, we extracted the rules from Apriori with support of 0.01 and confidence of 0.3.

We can have a look at the order of the rules (order 2 means a one-to-one rule; order 3 means a two-to-one rule).

And here we can look into detail each rule.

The next two graphs show the first ten rules by confidence. In the first one, we can look at each rule or at each element since it is an interactive plot.

6.2. Recommendation System

We run a for loop with the same parameters as before, once per item, so that we obtain rules for each item.

To check that the algorithm works, we look at the rules for two random items.

##     lhs                    rhs                   support    confidence
## [1] {Beef Tongue Stick} => {Chk Cartilage Stick} 0.02579132 0.3128889 
##     coverage   lift    count
## [1] 0.08242966 5.30034 1056
##     lhs                          rhs             support    confidence
## [1] {Yellowtail Belly Sa(1p)} => {O-Toro Sa(1p)} 0.02806272 0.3916155 
## [2] {Yellowtail Belly Sa(1p)} => {Salmon Sa(1p)} 0.02151719 0.3002727 
##     coverage   lift     count
## [1] 0.07165885 8.359910 1149 
## [2] 0.07165885 5.254002  881

6.3. WECLAT

First we create a list of itemsets with a support of at least 0.05 and take a look at them.

## 
## parameter specification:
##  support minlen maxlen target ext
##     0.05      1     10   <NA>  NA
## 
## algorithmic control:
##  sort verbose
##    NA    TRUE
##     items                     support  
## [1] {Mix Sashimi(Lg)}         0.1923604
## [2] {Yellowtail Belly Sa(1p)} 0.1864464
## [3] {Kumamoto Oyster(1)}      0.1807917
## [4] {Grill Yellowtail Collar} 0.1773892
## [5] {Sweet Shrimp Sa}         0.1607193
## [6] {Live Uni Sa}             0.1581864

We then created association rules with support at least 0.05 and confidence at least 0.3.
Here there are the first 6 rules.

##     lhs                          rhs                       support   
## [1] {Yellowtail Belly Sa(1p)} => {O-Toro Sa(1p)}           0.09785050
## [2] {O-Toro Sa(1p)}           => {Yellowtail Belly Sa(1p)} 0.09785050
## [3] {Yellowtail Belly Sa(1p)} => {Sweet Shrimp Sa}         0.08036788
## [4] {Sweet Shrimp Sa}         => {Yellowtail Belly Sa(1p)} 0.08036788
## [5] {Sweet Shrimp Sa}         => {O-Toro Sa(1p)}           0.07650787
## [6] {O-Toro Sa(1p)}           => {Sweet Shrimp Sa}         0.07650787
##     confidence lift    
## [1] 0.5248184  3.560273
## [2] 0.6638000  3.560273
## [3] 0.4310509  2.682011
## [4] 0.5000512  2.682011
## [5] 0.4760341  3.229329
## [6] 0.5190155  3.229329

WECLAT performs better than Apriori here because, even with a higher support threshold, we obtain more rules than before. However, it is not easy to construct a recommendation system based on this algorithm.

7. Appendix

knitr::opts_chunk$set(
    echo = FALSE,
    message = FALSE,
    warning = FALSE
)
library(readxl) # to read the file
library(lubridate) # for dateTime object
library(dplyr) # data manipulation
library(forcats) # data manipulation
library(tidyr) # reshaping
library(ggplot2) # data visualization
library(arules) # association rules
library(arulesViz) # association rules visualization

orders <- read_xlsx("orders.xlsx")
# str(orders) # every variable is a character, change the numeric variables into numeric and the dates into date type
orders$price <- as.numeric(orders$price)
orders$quantity <- as.numeric(orders$quantity)
orders$guestNum <- as.numeric(orders$guestNum)
orders$orderDurationInMinutes <- as.numeric(orders$orderDurationInMinutes)

orders$orderCreatedOn <- mdy_hm(orders$orderCreatedOn, tz = "America/Los_Angeles") # Pacific Time (PST/PDT) because the restaurant is in California
orders$orderClosedOn <- mdy_hm(orders$orderClosedOn, tz = "America/Los_Angeles")


order <- na.omit(orders)

order <- filter(order, price > 0.05)
# summary(order)
order <- filter(order, saleItemName != "Pre-Paid Credit")
order <- filter(order, saleItemName != "Free Rice")
order <- filter(order, saleItemName != "Rice Soup(Free)")
order <- filter(order, saleItemName != "Kurodai Bone Soup Free")
order <- filter(order, saleItemName != "Free Miso Soup")
order <- filter(order, saleItemName != "F.Lobster Noodle(Free)")
# count how many of some types
order %>% 
  group_by(orderType) %>%
  count() %>%
  arrange(desc(n))

# rate of DINE-IN orders (computed from the data rather than hardcoded;
# multiply by 100 so the printed value really is a percentage)
dineIn <- sum(order$orderType == "DINE_IN")
cat("Rate of DINE-IN orders:", round(100 * dineIn / nrow(order)), "%")

# rate of non-DINE-IN orders
cat("Rate of NON-DINE-IN orders:", round(100 * (1 - dineIn / nrow(order))), "%")

order %>%
  group_by(menuGroupName) %>%
  count() %>%
  arrange(desc(n))

order %>%
  group_by(categoryName) %>%
  count() %>%
  arrange(desc(n))
# Items that bring more money to the restaurant
# create variable REVENUE
order <- order %>% 
  mutate(revenue = price*quantity)

itemRevenue <- order %>%
  group_by(saleItemName) %>%
  summarise(revenue = sum(revenue))

itemRevenue %>%
  filter(revenue > 50000) %>%
  mutate(saleItemName = fct_reorder(saleItemName, revenue)) %>%
  select(saleItemName, revenue) %>%
  ggplot(aes(x=revenue, y=saleItemName)) +
  geom_bar(stat="identity", fill="orange", alpha=.7, width=.6) +
  xlab("Revenue") +
  theme_bw() +
  ggtitle("Items with highest revenue for the restaurant") 
  
itemRevenue %>%
  filter(revenue < 50) %>%
  mutate(saleItemName = fct_reorder(saleItemName, revenue)) %>%
  select(saleItemName, revenue) %>%
  ggplot(aes(x=revenue, y=saleItemName)) +
  geom_bar(stat="identity", fill="orange", alpha=.7, width=.6) +
  xlab("Revenue") +
  theme_bw() +
  ggtitle("Items with lowest revenue for the restaurant")
order1 <- order %>% 
  separate(orderClosedOn, c("dateClosedOn", "timeClosedOn"), sep = " ", remove = F) %>%
  separate(dateClosedOn, c("year", "month", "day"), sep = "-", remove = F)

order2 <- order1 %>% 
  group_by(dateClosedOn) %>%
  summarise(Daily_Revenue = sum(revenue))

order2$dateClosedOn <- ymd(order2$dateClosedOn, tz = "America/Los_Angeles")

# The wide form, remove the first and last observation because they are misleading

order2 <- order2 %>% slice(2:(n() - 1)) # drop first and last row without hardcoding row numbers
plot1 <- order2 %>% ggplot(aes(x = dateClosedOn)) +
  geom_line(aes(y = Daily_Revenue), color = "red") + # color outside aes() so the line is actually drawn red
  labs(title = "Daily Revenue")
plot1
order3 <- order1 %>% 
  group_by(month) %>%
  summarise(Monthly_Revenue = sum(revenue))

order3$month <- as.numeric(order3$month)

order3 <- order3 %>% 
  mutate('MonthName' = month.abb[month])

order3$MonthName <- factor(order3$MonthName, levels = order3$MonthName)

plot2 <- order3 %>% ggplot(aes(x= MonthName, y= Monthly_Revenue)) +
  geom_col(fill="blue", alpha=.7, width=.6) +
  xlab("") +
  theme_bw() +
  ggtitle("Monthly Revenue")
plot2

# Association rules
# convert the dataset into trans object to use it for association rules 
# I need only the columns orderId and saleItemName (3 and 6)

myorder<-order[,c(3,6)]  # create data frame with columns orderId and saleItemName
names(myorder) <- c("TransNum", "Item") # name the columns TransNum (transasction number = orderId) and Item ( = saleItemName)
# changed them into factors to have them ordered and to make the transaction object later
myorder$TransNum = as.factor(myorder$TransNum)
myorder$Item = as.factor(myorder$Item)
# unique(myorder$Item) #537 unique elements # count the elements just to see if they are the same number as dimension of txn
# unique(myorder$TransNum) #40944 unique elements
myorder0 <- myorder 
names(myorder0)<- NULL #set to null because otherwise we will have a transaction with the names of the columns in the dataframe myorder
write.csv(myorder0, file = transact <- file(), row.names = F) # save the file into the environment
txn <- read.transactions(transact, rm.duplicates = FALSE,
                         format = "single", sep = ',', cols = c(1,2)) # extract the file with function read.transactions to create a transaction objects, with single elements with no duplicates.

# dim(txn) # 40944 orders (transactions), 537 items
# relative item frequency (support)
itemFrequencyPlot(txn, topN = 10, cex.names = 1) # 10 items with highest support
rules <- apriori(txn, 
                 parameter = list(supp = 0.01, conf = 0.3,
                                  minlen = 2, # avoid "null" rules
                                  target = "rules")) # apriori algorithm, support at least 0.01, confidence at least 0.3, minlen = 2 means that it will avoid "single" rules (salmon to salmon for example)
inspect(head(rules))#extract the rules
plot(rules, method = "two-key plot")
plot(rules, engine = "plotly")
subrules <- head(rules, n = 10, by = "confidence")
plot(subrules, method = "graph",  engine = "htmlwidget") # best graph ever
plot(subrules, method="paracoord")
# Recommendation using Apriori ----

unique_names <- unique(as.character(myorder$Item)) # create vector of names of items
rules1 <- rep(0,length(unique_names)) 
rules_lhs<-as(rules1,"list") # create a list (we need the list to insert in the list we found)
names(rules_lhs) <- unique_names # create a named list (to each name, after the for cycle, will correspond 0, 1 or more rules)

for (i in 1:length(unique_names)){
  rules_lhs[i] <- apriori(txn,
                       parameter = list(supp=0.01, conf=0.3,
                                        minlen=2),
                       appearance = list(lhs=unique_names[i],
                                         default="rhs")) 
}
# repeat the cycle for every item,
# set IF part (LHS) = names of items, in this way we create a recommendation system for every item (if we select a really low support, we will have the rules for EVERY item, we don't do it because it is computationally expensive)

# if we select an item that does not have a support of at least 0.01, it will not return anything but it will not give an error.
inspect(rules_lhs[["Beef Tongue Stick"]])
inspect(rules_lhs[["Yellowtail Belly Sa(1p)"]])
# Eclat Algorithm (weclat because weighted)
## Mine itemsets with minimum support of 0.1 and 5 or less items
# use function weclat to for weighted associations

# create the weights: aggregate revenue from single line items to whole orders
weight <- order %>% 
  group_by(orderId) %>%
  summarise(revenue = sum(revenue))

weight <- weight$revenue # create the weights ( = revenue) 
txn2 <- txn # transaction object same as before
# dim(txn2)

transactionInfo(txn2) <- data.frame(weight = weight)  # insert into the transaction object, in the "Info" the weights.

## mine weighed support itemsets
s <- weclat(txn2, parameter = list(support = 0.05),
            control = list(verbose = TRUE))   # create the list of items with support at least 0.05 
inspect(head(sort(s))) # 70 items  # look at the items
## create association rules
r <- ruleInduction(s, confidence = 0.3) # create association rules with confidence at least 0.3
inspect(head(sort(r)))  # look at the rules #37 rules